Agent API: JSON actions, observations, phase flow #1
Add JSON-serializable action and observation interfaces for agents:
- get_available_action_dicts(): Returns list of ActionDict with id, type, label, params, phase
- take_action_dict(action): Executes action dict and returns ActionResult
- get_observation(): Returns complete observable game state as ObservationDict

Action types implemented:
- Combat: play_card, use_potion, end_turn
- Map: path_choice
- Events: event_choice, neow_choice
- Rewards: pick_card, skip_card, singing_bowl, claim_gold/potion/relic, etc.
- Shop: buy_card, buy_relic, buy_potion, remove_card, leave_shop
- Rest: rest, smith, dig, lift, toke, recall
- Treasure: take_relic, sapphire_key, leave_treasure
- Boss: pick_boss_relic, skip_boss_relic

Observation includes:
- run: seed, ascension, act, floor, gold, hp, deck, relics, potions, keys
- map: nodes, edges, available_paths, visited_nodes
- combat: player, energy, stance, hand, draw_pile, discard_pile, enemies
- event: event_id, phase, choices
- reward: gold, potion, card_rewards, relic, boss_relics
- shop: colored_cards, colorless_cards, relics, potions, purge_cost
- rest: available_actions

Also fixes conftest.py to use relative paths for worktree compatibility.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
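As a rough illustration of how an agent might drive these three methods, here is a minimal sketch. The ActionDict field names come from the description above; the value types, the `StubRunner` class, the `"success"` key in the result, and `run_random_agent` are assumptions made for the example and are not taken from the PR's code.

```python
import json
import random
from typing import Any, Dict, List, TypedDict


class ActionDict(TypedDict):
    # Field names are from the PR description; the value types are assumed.
    id: str
    type: str
    label: str
    params: Dict[str, Any]
    phase: str


class StubRunner:
    """Hypothetical stand-in for GameRunner exposing the three agent methods."""

    def __init__(self) -> None:
        self.turns_left = 3

    def get_available_action_dicts(self) -> List[ActionDict]:
        if self.turns_left == 0:
            return []
        return [
            {"id": "end_turn", "type": "end_turn", "label": "End turn",
             "params": {}, "phase": "combat"},
        ]

    def take_action_dict(self, action: ActionDict) -> Dict[str, Any]:
        # The real ActionResult fields are not shown in the PR; "success" is assumed.
        self.turns_left -= 1
        return {"success": True}

    def get_observation(self) -> Dict[str, Any]:
        return {"phase": "combat", "turns_left": self.turns_left}


def run_random_agent(runner: Any, rng: random.Random, max_steps: int = 100) -> int:
    """Pick uniformly among the available actions until none remain."""
    steps = 0
    while steps < max_steps:
        actions = runner.get_available_action_dicts()
        if not actions:
            break
        # Round-trip through JSON to confirm the action is serializable.
        action = json.loads(json.dumps(rng.choice(actions)))
        runner.take_action_dict(action)
        steps += 1
    return steps
```

Passing a seeded `random.Random` keeps the loop reproducible, which matters if the agent API is expected to be deterministic for a given seed.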
Cursor Bugbot has reviewed your changes and found 3 potential issues.
    "label": "Leave",
    "params": {},
    "phase": "treasure",
})
Agent API adds invalid leave_treasure action not in game
Medium Severity
The generate_treasure_actions function adds a leave_treasure action that doesn't exist in the original game logic. In Slay the Spire, treasure chests require either taking the relic or (in Act 3) taking the sapphire key - you cannot simply leave without taking anything. The original _get_treasure_actions() in game.py only provides take_relic and sapphire_key options. This introduces non-standard game behavior that could lead to RL agents learning invalid strategies.
    "label": "Skip boss relic",
    "params": {},
    "phase": "boss_reward",
})
Agent API adds invalid skip_boss_relic action not in game
Medium Severity
The generate_boss_reward_actions function adds a skip_boss_relic action that doesn't exist in the original game. In Slay the Spire, after defeating a boss you must choose one of the three offered boss relics - skipping is not an option. The original _get_boss_reward_actions() in game.py only returns BossRewardAction(i) for each relic without a skip option. This introduces non-standard game behavior.
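One way to catch this kind of drift is a parity check that compares the action types the agent API emits against the types the engine offers for the same phase. The sketch below is hypothetical: `ENGINE_ACTION_TYPES` is filled in from the two review comments rather than read from game.py, and `extra_action_types` is a name invented for the example.

```python
from typing import Any, Dict, Iterable, List, Set

# Assumed mapping of phase -> action types the engine itself offers,
# based on the review comments (not extracted from game.py).
ENGINE_ACTION_TYPES: Dict[str, Set[str]] = {
    "treasure": {"take_relic", "sapphire_key"},
    "boss_reward": {"pick_boss_relic"},
}


def extra_action_types(
    actions: Iterable[Dict[str, Any]],
    engine_types: Dict[str, Set[str]] = ENGINE_ACTION_TYPES,
) -> List[str]:
    """Return the sorted action types that the engine does not offer.

    Phases missing from `engine_types` are skipped, so the check only
    covers phases that have been mapped explicitly.
    """
    extras: Set[str] = set()
    for action in actions:
        allowed = engine_types.get(action["phase"])
        if allowed is not None and action["type"] not in allowed:
            extras.add(action["type"])
    return sorted(extras)
```

A test could call this on the output of get_available_action_dicts() in each phase and assert that the result is empty, or that it contains only extensions the project has deliberately allowed.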
        available.append("lift")

    if runner.run_state.has_relic("Peace Pipe"):
        available.append("toke")
Rest observation says toke available without checking removable cards
Low Severity
The generate_rest_observation function adds "toke" to available_actions when the player has the Peace Pipe relic, without checking if there are actually removable cards. However, generate_rest_actions only generates toke actions if get_removable_cards() returns cards. This inconsistency means the observation might indicate "toke" is available when no actual toke actions exist. While decks are rarely empty in practice, this creates an inconsistent API contract.
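A possible way to remove the mismatch is to derive the observation's available_actions from the generated actions, so both sides share one condition. This is only a sketch: `sketch_rest_actions` and `sketch_rest_observation` are hypothetical stand-ins that take plain arguments instead of a runner, and the `card_index` parameter name is assumed.

```python
from typing import Any, Dict, List, Set


def sketch_rest_actions(relics: Set[str], removable_cards: List[str]) -> List[Dict[str, Any]]:
    """Build rest-site actions; toke is emitted once per removable card."""
    actions: List[Dict[str, Any]] = [{"type": "rest", "params": {}}]
    if "Peace Pipe" in relics:
        for index, _card in enumerate(removable_cards):
            # "card_index" is an assumed parameter name.
            actions.append({"type": "toke", "params": {"card_index": index}})
    return actions


def sketch_rest_observation(relics: Set[str], removable_cards: List[str]) -> Dict[str, Any]:
    """Report only the action types that actually have a generated action."""
    available: List[str] = []
    for action in sketch_rest_actions(relics, removable_cards):
        if action["type"] not in available:
            available.append(action["type"])
    return {"available_actions": available}
```

With this shape, a player holding Peace Pipe but no removable cards sees no "toke" entry in the observation, which matches the actions that can be taken.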


Summary
- get_available_action_dicts() returning JSON actions
- take_action_dict() with validation
- get_observation() with full schema

Test Results
31 tests passing in test_agent_api.py
Files Changed
🤖 Generated with Claude Code
Note
Medium Risk
Adds a large, side-effecting agent_api module that monkey-patches GameRunner and introduces new code paths for phase transitions (e.g., boss relic skipping/treasure leaving). While mostly additive, mistakes could affect run flow or determinism for agent integrations.

Overview
Adds a new agent_api surface that exposes GameRunner.get_available_action_dicts(), GameRunner.take_action_dict(), and GameRunner.get_observation() for RL agents using JSON-serializable actions/observations across phases (Neow, map navigation, combat, rewards, events, shop, rest, treasure, boss rewards).

take_action_dict() maps action dicts into existing engine actions and includes explicit handling for special transitions like skip_boss_relic and leave_treasure. The engine package now auto-imports agent_api (auto-patching GameRunner) and exports the new TypedDict types.

Adds a comprehensive tests/test_agent_api.py suite covering action generation/execution, observation schema/JSON-serializability, phase transitions, and determinism, and updates tests/conftest.py to use a repo-relative path setup.

Written by Cursor Bugbot for commit 61024cf. This will update automatically on new commits.
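To make the monkey-patching risk concrete, the sketch below shows one way a module could attach methods to a class behind an explicit, idempotent install function, followed by a JSON round-trip check of the kind the test suite is described as performing. The `GameRunner` class here is a minimal hypothetical stand-in, and `install_agent_api` and `_agent_api_installed` are invented names; the PR's actual patching code is not shown on this page.

```python
import json
from typing import Any, Dict


class GameRunner:
    """Minimal hypothetical stand-in for the engine's GameRunner."""

    def __init__(self) -> None:
        self.floor = 1
        self.gold = 99


def _get_observation(self: GameRunner) -> Dict[str, Any]:
    # Only plain dicts, lists, strings, and numbers, so json.dumps accepts it.
    return {"run": {"floor": self.floor, "gold": self.gold}}


def install_agent_api(cls: type) -> bool:
    """Attach the agent methods to `cls`; return False if already installed."""
    if getattr(cls, "_agent_api_installed", False):
        return False
    cls.get_observation = _get_observation
    cls._agent_api_installed = True
    return True


def is_json_round_trippable(value: Any) -> bool:
    """True if `value` survives json.dumps followed by json.loads unchanged."""
    return json.loads(json.dumps(value)) == value
```

An explicit install call is an alternative to patching at import time: it keeps the side effect visible to callers, and the guard makes repeated imports or calls harmless.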